Zhongfeng Wang

Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors

May 01, 2025

Block Circulant Adapter for Large Language Models

May 01, 2025

CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model

Apr 08, 2025

Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format

Nov 24, 2024

TaQ-DiT: Time-aware Quantization for Diffusion Transformers

Nov 21, 2024

M$^2$-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization

Oct 10, 2024

Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores

Sep 26, 2024

NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models

Sep 07, 2024

Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment

Jul 16, 2024

P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer

May 30, 2024